Create AI voices that understand emotional expression

Prompt to generate AI voices, change emotions, and more

Describe the desired AI voice's identity, voice qualities, and more

To help shape the voice we generate, input something distinctive this AI voice would say

Trusted By

Design Works Logo
Lge Logo
Woven Logo
Softbank Logo
Humana Logo
Aecho AI Logo
Betteryou Logo
Nestwork Logo
Innovax Systems Logo
Jammy Chat Logo
Aura Health Logo
Wonsulting Logo
Memorang Logo
Flourish Logo
Climb Together Logo
Sentra Logo
Althea Logo
Study Fetch Logo
Tone AI Logo
Thumos Logo
Ream Logo
New Computer Logo
Everfriends AI Logo
Mynd Logo
Pressmaster AI Logo
Nancy AI Logo
Parrot Prep Logo
Stimuler Logo
Quantanite Logo

A text-to-speech system that understands what it's saying

Octave (Omni-capable text and voice engine) isn't a traditional TTS model. It’s a voice-based LLM. That means it understands what words mean in context, so it can predict emotions, cadence, and more.

Create any voice you can imagine with Octave Voice Design

Create any AI voice you can imagine, like a "sarcastic medieval peasant," with a brief prompt or evocative script

"sarcastic medieval peasant"

Full prompt: "The speaker is a medieval peasant with a cockney accent, raspy voice, dripping with sarcasm."

"literature professor"

Full prompt: "A retired Black female literature professor who analyzes poetry with precise academic language and references to her own published criticism."

"charming cowboy"

Full prompt: "The speaker is a grizzled old cowboy with a folksy Texan drawl Southern accent, speaking in a charismatic tone with a deep but relaxed vibe."

"sitcom inner monologue"

Full prompt: "The star of a popular sitcom, with frequent inner monologues about her life."

"dungeon master"

Full prompt: "A know-it-all dungeons and dragons dungeon master speaking excitedly with a lisp."

"warm English narrator"

Full prompt: "The speaker is a sophisticated British female narrator with a gentle, warm voice, recounting the ending of a classic romance novel."

"unserious movie trailer guy"

Full prompt: "The speaker is an American, deep middle-aged male film trailer narrator for a film about chickens."

"raspy evil vampire"

Full prompt: "A villainous undead vampire, with a horrifying raspy voice, and a slight Transylvanian accent."

"reminiscing"

Full prompt: "A middle-aged African American man, reminiscing with a slightly gravelly voice and a tone of hard-earned wisdom."

"nature documentary narrator"

Full prompt: "The speaker is a distinguished British narrator, whose voice carries a deep sense of wisdom and curiosity."

"Texan fishing guru"

Full prompt: "The speaker has a booming, charismatic radio voice, like a Texan fishing guru with a hint of gravel and an infectious laugh, perfect for reeling in listeners to 'Big Dicky's live fishing frenzy.'"

Generating the best AI voices has never been easier

In a blind comparison study with over 100 human raters, Octave’s outputs were favored over outputs from ElevenLabs Voice Design in terms of audio quality, naturalness, and how well speech generations matched descriptions of the desired voice, across 120 diverse prompts.

The first AI voice generator that can take nuanced Acting Instructions

As an LLM for voice, Octave can interpret your prompt and adjust its delivery accordingly, from "angry" to "just above a whisper."

"whispering, hushed"

Here, we combine the text "Are you serious?" with the prompt "whispering, hushed."

"angry, furious"

With speaker and text held constant, we change the prompt to "angry, furious."

"calm, serene"

With speaker and text held constant, we change the prompt to "calm, serene."

“disgusted, disdainful”

With speaker and text held constant, we change the prompt to "disgusted, disdainful."

"pained, shocked"

With speaker and text held constant, we change the prompt to "pained, shocked."

Any emotion or speaking style, on command

Octave is the first TTS system that can take natural language instructions to change emotional delivery and speaking style. Give directions like "sound sarcastic" or "whisper fearfully." For the first time, creators have total control.
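
The acting-instruction examples above hold the speaker and text constant and change only the prompt. As a rough illustration of that setup, the sketch below builds one request payload per instruction. It is not Hume's API: the `Utterance` class, its field names (`voice`, `text`, `acting_instructions`), and the voice name `my-designed-voice` are hypothetical, chosen only to mirror the wording on this page. Consult the developer documentation for the real request format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Utterance:
    """Hypothetical payload shape; field names are assumptions, not Hume's schema."""
    voice: str                 # which speaker to use (held constant)
    text: str                  # what the speaker says (held constant)
    acting_instructions: str   # natural-language delivery prompt (varied)


def vary_instructions(voice: str, text: str, instructions: list[str]) -> list[Utterance]:
    """Return one utterance per instruction, keeping voice and text fixed."""
    return [Utterance(voice, text, instruction) for instruction in instructions]


# The five prompts used in the examples on this page.
INSTRUCTIONS = [
    "whispering, hushed",
    "angry, furious",
    "calm, serene",
    "disgusted, disdainful",
    "pained, shocked",
]

batch = vary_instructions("my-designed-voice", "Are you serious?", INSTRUCTIONS)

for utterance in batch:
    # Each line is a JSON object that a client could send to a TTS endpoint.
    print(json.dumps(asdict(utterance)))
```

Keeping the voice and text fixed in one place makes it easy to compare how a single line sounds under different directions, which is the comparison the audio samples above are making.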

For creators and developers alike

Octave was built to generate the most expressive AI voices for any content: podcasts, voiceovers, audiobooks, and more. With our API, you can bring it to any application.

TTS Projects
Empathic Voice Interface (EVI)
Real-time interaction

Based on a new voice-to-voice AI model architecture, EVI 2 can converse rapidly and fluently. It understands the user’s tone of voice and generates an appropriate tone of voice automatically. It's capable of emulating a wide range of personalities, accents, and speaking styles. It can replace or integrate with other LLMs.

Explore EVI 2's capabilities

Compelling personalities (Aura) with EVI 2
"Hey Aura..."
Empathically expressive speech with EVI 2
"I’m launching something I'm excited about…"
Compelling personalities (Whimsy) with EVI 2
"Hey Whimsy..."
Rapping on command with EVI 2
"Can you freestyle rap about yourself?"
Prompting rate of speech with EVI 2
"Can you speak faster from now on?"
Nonverbal vocalizations with EVI 2
"Could you laugh maniacally for us?"
Inventing new vocal expressions with EVI 2
"Now can you make a sound of joy and enthusiasm?"
Emergent multilingual capabilities with EVI 2
"Can you speak Spanish?"
Compelling personalities (Stella) with EVI 2
"Hey Stella..."

Built for Developers
Interact with synthetic voices and personalities

Create an interactive personality for your use case with flexible prompting and voice modulation tools. We developed a novel voice modulation approach that allows anyone to adjust EVI 2’s base voices along a number of continuous scales, including femininity, nasality, pitch, and more.
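
One way to picture adjusting a base voice along continuous scales is as a small settings object with one number per scale. The sketch below is an illustration only, not EVI 2's actual configuration format: the `VoiceSettings` class, the `adjust` helper, the base voice name, and the assumed range of -1.0 to 1.0 are all hypothetical. Only the scale names (femininity, nasality, pitch) come from the text above.

```python
from dataclasses import dataclass, fields, replace

# Assumed bounds for every scale; the real range is not stated on this page.
SCALE_MIN, SCALE_MAX = -1.0, 1.0


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical modulation settings for a base voice (0.0 means unchanged)."""
    base_voice: str
    femininity: float = 0.0
    nasality: float = 0.0
    pitch: float = 0.0


# Names of the adjustable scales, i.e. every field except the base voice.
SCALES = {f.name for f in fields(VoiceSettings)} - {"base_voice"}


def clamp(value: float) -> float:
    """Keep a scale value inside the assumed range."""
    return max(SCALE_MIN, min(SCALE_MAX, value))


def adjust(settings: VoiceSettings, **deltas: float) -> VoiceSettings:
    """Return a copy of settings with each named scale shifted by its delta."""
    unknown = set(deltas) - SCALES
    if unknown:
        raise ValueError(f"unknown scale(s): {sorted(unknown)}")
    updates = {name: clamp(getattr(settings, name) + delta)
               for name, delta in deltas.items()}
    return replace(settings, **updates)


base = VoiceSettings(base_voice="example-base-voice")
deeper = adjust(base, pitch=-0.5, nasality=0.25)
print(deeper)
```

Because the settings object is immutable, each adjustment produces a new variant and leaves the base voice untouched, so several personalities can be derived from the same starting point.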

Optimized for Human Well-Being
Build AI voices people can trust

EVI 2 excels at anticipating and adapting to users' preferences, made possible by its special training for emotional intelligence. Its pleasant and fun personality is a result of this deeper alignment with human values.

In deploying this technology, we require developers to adhere to the guidelines of The Hume Initiative, a non-profit that sets the first concrete guidelines for empathic AI.

Developer Resources

Developer Platform

Create your Hume account, get your API keys, monitor your usage, and explore our products in the interactive platform.

Visit the platform

Developer Documentation

Explore our documentation with concise guides, hands-on tutorials, and an in-depth API reference—crafted to support your integration.

Explore the docs

Developer Community

Join our community of developers and researchers working with Hume APIs—your go-to hub for collaboration, support, and knowledge sharing.

Join our community
